108 research outputs found

Stationary Mixing Bandits

Author: Audiffren Julien
Ralaivola Liva
Publication date: 05/06/2014

We study the bandit problem where arms are associated with stationary phi-mixing processes and where rewards are therefore dependent: the question that arises from this setting is that of recovering some independence by ignoring the value of some rewards. As we shall see, the bandit problem we tackle requires us to address the exploration/exploitation/independence trade-off. To do so, we provide a UCB strategy together with a general regret analysis for the case where the size of the independence blocks (the ignored rewards) is fixed, and we go a step further by providing an algorithm that is able to compute the size of the independence blocks from the data. Finally, we give an analysis of our bandit problem in the restless case, i.e., in the situation where the time counters for all mixing processes simultaneously evolve.

arXiv.org e-Print Archive

HAL AMU

From Cutting Planes Algorithms to Compression Schemes and Active Learning

Author: Louche Ugo
Ralaivola Liva
Publication date: 12/07/2015

Cutting-plane methods are well-studied localization (and optimization) algorithms. We show that they provide a natural framework to perform machine learning, and not just to solve optimization problems posed by machine learning, in addition to their intended optimization use. In particular, they allow one to learn sparse classifiers and provide good compression schemes. Moreover, we show that very little effort is required to turn them into effective active learning methods. This last property provides a generic way to design a whole family of active learning algorithms from existing passive methods. We present numerical simulations attesting to the relevance of cutting-plane methods for passive and active learning tasks. Comment: IJCNN 2015, Jul 2015, Killarney, Ireland. 2015, http://www.ijcnn.org/

arXiv.org e-Print Archive

Crossref

HAL AMU

Confusion Matrix Stability Bounds for Multiclass Classification

Author: Machart Pierre
Ralaivola Liva
Publication date: 01/01/2012

In this paper, we provide new theoretical results on the generalization properties of learning algorithms for multiclass classification problems. The originality of our work is that we propose to use the confusion matrix of a classifier as a measure of its quality; our contribution is in the line of work which attempts to set up and study the statistical properties of new evaluation measures such as, e.g., ROC curves. In the confusion-based learning framework we propose, we claim that a targeted objective is to minimize the size of the confusion matrix C, measured through its operator norm ||C||. We derive generalization bounds on the (size of the) confusion matrix in an extended framework of uniform stability, adapted to the case of matrix-valued loss. Pivotal to our study is a very recent matrix concentration inequality that generalizes McDiarmid's inequality. As an illustration of the relevance of our theoretical results, we show how two SVM learning procedures can be proved to be confusion-friendly. To the best of our knowledge, the present paper is the first that focuses on the confusion matrix from a theoretical point of view.

arXiv.org e-Print Archive

CiteSeerX

HAL AMU
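
As a rough illustration of the quantity ||C|| named in the abstract above, the following sketch computes the operator norm (largest singular value) of an empirical confusion matrix. It assumes, and this is an assumption rather than something stated in the listing, that C is row-normalized by true class and that its diagonal is zeroed so that only errors contribute. The function name `confusion_matrix_norm` is hypothetical.

```python
import numpy as np

def confusion_matrix_norm(y_true, y_pred, n_classes):
    """Operator norm of a row-normalized confusion matrix with zeroed diagonal."""
    counts = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true, y_pred):
        counts[t, p] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Estimate P(predict j | true class i); rows with no examples stay at zero.
    C = np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)
    # Assumption: correct predictions are discarded, so only errors are measured.
    np.fill_diagonal(C, 0.0)
    # The matrix 2-norm is the largest singular value, i.e. the operator norm.
    return float(np.linalg.norm(C, 2))
```

Under these assumptions, a classifier that makes no errors yields a norm of 0, and the norm grows as misclassification mass concentrates in the off-diagonal entries.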

Decoy Bandits Dueling on a Poset

Author: Audiffren Julien
Ralaivola Liva
Publication date: 05/02/2016

We address the problem of dueling bandits defined on partially ordered sets, or posets. In this setting, arms may not be comparable, and there may be several (incomparable) optimal arms. We propose an algorithm, UnchainedBandits, that efficiently finds the set of optimal arms of any poset even when pairs of comparable arms cannot be distinguished from pairs of incomparable arms, with a set of minimal assumptions. This algorithm relies on the concept of decoys, which stems from social psychology. For the easier case where the incomparability information may be accessible, we propose a second algorithm, SlicingBandits, which takes advantage of this information and achieves a very significant gain in performance compared to UnchainedBandits. We provide theoretical guarantees and experimental evaluation for both algorithms.

arXiv.org e-Print Archive

HAL AMU

Confusion-Based Online Learning and a Passive-Aggressive Scheme

Author: Ralaivola Liva
Publication venue: HAL CCSD
Publication date: 03/12/2012

This paper provides the first (to the best of our knowledge) analysis of online learning algorithms for multiclass problems when the confusion matrix is taken as a performance measure. The work builds upon recent and elegant results on noncommutative concentration inequalities, i.e., concentration inequalities that apply to matrices and, more precisely, to matrix martingales. We establish generalization bounds for online learning algorithms and show how the theoretical study motivates the proposition of a new confusion-friendly learning procedure. This learning algorithm, called COPA (for COnfusion Passive-Aggressive), is a passive-aggressive learning algorithm; it is shown that the update equations for COPA can be computed analytically and, hence, there is no need to resort to any optimization package to implement it.

HAL AMU

Unconfused Ultraconservative Multiclass Algorithms

Author: Louche Ugo
Ralaivola Liva
Publication date: 13/11/2013

We tackle the problem of learning linear classifiers from noisy datasets in a multiclass setting. The two-class version of this problem was studied a few years ago by, e.g., Bylander (1994) and Blum et al. (1996): in these contributions, the proposed approaches to fight the noise revolve around a Perceptron learning scheme fed with peculiar examples computed through a weighted average of points from the noisy training set. We propose to build upon these approaches and we introduce a new algorithm called UMA (for Unconfused Multiclass additive Algorithm), which may be seen as a generalization to the multiclass setting of the previous approaches. In order to characterize the noise, we use the confusion matrix as a multiclass extension of the classification noise studied in the aforementioned literature. Theoretically well-founded, UMA furthermore displays very good empirical noise robustness, as evidenced by numerical simulations conducted on both synthetic and real data. Keywords: Multiclass classification, Perceptron, Noisy labels, Confusion Matrix. Comment: ACML, Australia (2013)

arXiv.org e-Print Archive

HAL AMU

Chromatic PAC-Bayes Bounds for Non-IID Data: Applications to Ranking and Stationary $\beta$-Mixing Processes

Author: Ralaivola Liva
Stempfel Guillaume
Szafranski Marie
Publication date: 09/09/2009

PAC-Bayes bounds are among the most accurate generalization bounds for classifiers learned from independently and identically distributed (IID) data, and it is particularly so for margin classifiers: there have been recent contributions showing how practical these bounds can be either to perform model selection (Ambroladze et al., 2007) or even to directly guide the learning of linear classifiers (Germain et al., 2009). However, there are many practical situations where the training data show some dependencies and where the traditional IID assumption does not hold. Stating generalization bounds for such frameworks is therefore of the utmost interest, both from theoretical and practical standpoints. In this work, we propose the first, to the best of our knowledge, PAC-Bayes generalization bounds for classifiers trained on data exhibiting interdependencies. The approach undertaken to establish our results is based on the decomposition of a so-called dependency graph, which encodes the dependencies within the data, into sets of independent data, thanks to graph fractional covers. Our bounds are very general, since being able to find an upper bound on the fractional chromatic number of the dependency graph is sufficient to get new PAC-Bayes bounds for specific settings. We show how our results can be used to derive bounds for ranking statistics (such as AUC) and classifiers trained on data distributed according to a stationary $\beta$-mixing process. Along the way, we show how our approach seamlessly allows us to deal with U-processes. As a side note, we also provide a PAC-Bayes generalization bound for classifiers learned on data from stationary $\varphi$-mixing distributions. Comment: Long version of the AISTATS 09 paper: http://jmlr.csail.mit.edu/proceedings/papers/v5/ralaivola09a/ralaivola09a.pd

arXiv.org e-Print Archive

HAL Evry

HAL AMU

Multiple Subject Learning for Inter-Subject Prediction

Author: Ralaivola Liva
Takerkart Sylvain
Publication venue: HAL CCSD
Publication date: 04/06/2014

Multi-voxel pattern analysis has become an important tool for neuroimaging data analysis by making it possible to predict a behavioral variable from the imaging patterns. However, standard models do not take into account the differences that can exist between subjects, so they perform poorly in the inter-subject prediction task. We here introduce a model called Multiple Subject Learning (MSL) that is designed to effectively combine the information provided by fMRI data from several subjects: in a first stage, a weighting of single-subject kernels is learnt using multiple kernel learning to produce a classifier; then, a data shuffling procedure makes it possible to build ensembles of such classifiers, which are then combined by a majority vote. We show that MSL outperforms other models in the inter-subject prediction task and we discuss the empirical behavior of this new model.

Crossref

HAL AMU

MKPM: A multiclass extension to the kernel projection machine

Author: Ralaivola Liva
Takerkart Sylvain
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 21/06/2011

We introduce Multiclass Kernel Projection Machines (MKPM), a new formalism that extends the Kernel Projection Machine framework to the multiclass case. Our formulation is based on the use of output codes and it implements a co-regularization scheme by simultaneously constraining the projection dimensions associated with the individual predictors that constitute the global classifier. In order to solve the optimization problem posed by our formulation, we propose an efficient dynamic programming approach. Numerical simulations conducted on a few pattern recognition problems illustrate the soundness of our approach.

HAL AMU

Differentially Private Sliced Wasserstein Distance

Author: Rakotomamonjy Alain
Ralaivola Liva
Publication date: 01/07/2021

Developing machine learning methods that are privacy preserving is today a central topic of research, with huge practical impacts. Among the numerous ways to address privacy-preserving learning, we here take the perspective of computing the divergences between distributions under the Differential Privacy (DP) framework: being able to compute divergences between distributions is pivotal for many machine learning problems, such as learning generative models or domain adaptation problems. Instead of resorting to the popular gradient-based sanitization method for DP, we tackle the problem at its roots by focusing on the Sliced Wasserstein Distance and seamlessly making it differentially private. Our main contribution is as follows: we analyze the property of adding a Gaussian perturbation to the intrinsic randomized mechanism of the Sliced Wasserstein Distance, and we establish the sensitivity of the resulting differentially private mechanism. One of our important findings is that this DP mechanism transforms the Sliced Wasserstein Distance into another distance, which we call the Smoothed Sliced Wasserstein Distance. This new differentially private distribution distance can be plugged into generative models and domain adaptation algorithms in a transparent way, and we empirically show that it yields highly competitive performance compared with gradient-based DP approaches from the literature, with almost no loss in accuracy for the domain adaptation problems that we consider.

arXiv.org e-Print Archive

HAL - Normandie Université
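
To make the idea in the abstract above more concrete, here is a minimal sketch of a Monte Carlo sliced Wasserstein distance in which Gaussian noise is added to the random projections. It is one plausible reading of "adding a Gaussian perturbation to the intrinsic randomized mechanism", not the authors' implementation. The function name `smoothed_sliced_wasserstein`, the parameter `sigma`, and the equal-sample-size restriction are assumptions made for illustration, and no privacy calibration of `sigma` is attempted.

```python
import numpy as np

def smoothed_sliced_wasserstein(X, Y, n_projections=50, sigma=0.0, seed=0):
    """Sliced 2-Wasserstein distance with Gaussian-perturbed projections.

    X, Y: arrays of shape (n, d) with the same number of rows n.
    sigma: hypothetical noise scale (sigma=0 gives the plain sliced distance).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Draw random unit directions, one per column.
    theta = rng.normal(size=(d, n_projections))
    theta /= np.linalg.norm(theta, axis=0, keepdims=True)
    # Project both samples, then add independent Gaussian noise to each projection.
    px = X @ theta + sigma * rng.normal(size=(n, n_projections))
    py = Y @ theta + sigma * rng.normal(size=(n, n_projections))
    # In one dimension, optimal transport between equal-size samples pairs sorted values.
    px.sort(axis=0)
    py.sort(axis=0)
    return float(np.sqrt(np.mean((px - py) ** 2)))
```

With `sigma=0` the function reduces to an ordinary sliced Wasserstein estimate, so comparing a sample with itself returns 0; with `sigma > 0` the two noise draws differ, so the same comparison is strictly positive, which reflects that the perturbed quantity is a different distance from the unperturbed one.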